Strategies for the effective identification of remotely related sequences in multiple PSSM search approach.
Authors
Abstract
Searches using position-specific scoring matrices (PSSMs) are commonly used in remote homology detection procedures such as PSI-BLAST and RPS-BLAST. A PSSM is typically generated using one of the sequences of a family as the reference sequence. In PSI-BLAST searches, the reference sequence is the same as the query. We recently showed that searches against a database of multiple family profiles, with each member of the family used in turn as the reference sequence, are more effective than searches against the classical database of single family profiles. Despite a relatively better overall performance than common sequence-profile matching procedures, searches against the multiple family-profiles database still produce a few false positives and false negatives. Here we show that profile length and the divergence of the sequences used to construct a PSSM have a major influence on the performance of the multiple-profile search approach. We also find that a simple parameter, defined as the number of PSSMs of a family that are hit by a query divided by the total number of PSSMs in that family, can effectively distinguish true positives from false positives in the multiple-profile search approach.
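The parameter described at the end of the abstract (PSSMs of a family hit by a query, divided by the total number of PSSMs in that family) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names, the (family_id, pssm_id) hit format, and the 0.5 cutoff are hypothetical and are not taken from the paper, which does not give a threshold in its abstract.

```python
from collections import defaultdict


def family_hit_fractions(hits, family_sizes):
    """Compute, per family, (distinct PSSMs hit) / (total PSSMs in the family).

    hits: iterable of (family_id, pssm_id) pairs reported for one query
          (hypothetical format; adapt to the search tool's actual output).
    family_sizes: dict mapping family_id to the total number of PSSMs
                  built for that family.
    """
    hit_pssms = defaultdict(set)
    for family_id, pssm_id in hits:
        # A set ensures a PSSM hit several times is counted once.
        hit_pssms[family_id].add(pssm_id)
    return {
        family_id: len(pssms) / family_sizes[family_id]
        for family_id, pssms in hit_pssms.items()
    }


def families_above_cutoff(fractions, min_fraction=0.5):
    """Keep families whose hit fraction reaches the cutoff.

    min_fraction is an illustrative value, not one reported in the abstract.
    """
    return sorted(f for f, frac in fractions.items() if frac >= min_fraction)


# Example: the query hits 2 of 4 PSSMs in famA and 1 of 10 in famB.
example_hits = [("famA", "a1"), ("famA", "a2"), ("famA", "a2"), ("famB", "b1")]
example_sizes = {"famA": 4, "famB": 10}
example_fractions = family_hit_fractions(example_hits, example_sizes)
```

In this toy input, famA has a much higher hit fraction than famB, which is the kind of separation the abstract suggests can help tell true positives from false positives; a suitable cutoff would need to be calibrated on benchmark data.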
Similar resources
Designing Of Degenerate Primers-Based Polymerase Chain Reaction (PCR) For Amplification Of WD40 Repeat-Containing Proteins Using Local Allignment Search Method
Degenerate primers-based polymerase chain reaction (PCR) are commonly used for isolation of unidentified gene sequences in related organisms. For designing the degenerate primers, we propose the use of local alignment search method for searching the conserved regions long enough to design an acceptable primer pair. To test this method, a WD40 repeat-containing domain protein from Beauveria bass...
Identification and Prioritizing Consequences and Strategies of Effective Crisis Management in Iranian Sports Stadiums
Introduction: The aim of the present study is identification and prioritizing consequences and strategies of effective crisis management in Iranian sports stadiums. Method: The research method was mixed. The qualitative data collection tool was through in-depth interviews. After conducting 19 interviews with experts, the codes reached theoretical saturation. Data analysis was performed simultan...
Query-seeded iterative sequence similarity searching improves selectivity 5–20-fold
Iterative similarity search programs, like psiblast, jackhmmer, and psisearch, are much more sensitive than pairwise similarity search methods like blast and ssearch because they build a position specific scoring model (a PSSM or HMM) that captures the pattern of sequence conservation characteristic to a protein family. But models are subject to contamination; once an unrelated sequence has bee...
The analysis of hazard identification and risk assessment studies with the approach to assessing risk control measures since 2001 to 2017: A systemic review
Abstract background and aims: Nowadays the growing complexity of technology and industry has led to vast changes over the last few decades. These changes, in addition to their positive and valuable effects, have also caused industrial accidents affecting human life and the environment. According to the ILO 2011 report, there are 340 million annual workplace accidents and 160 million occupation...
Development of an Efficient Hybrid Method for Motif Discovery in DNA Sequences
This work presents a hybrid method for motif discovery in DNA sequences. The proposed method called SPSO-Lk, borrows the concept of Chebyshev polynomials and uses the stochastic local search to improve the performance of the basic PSO algorithm as a motif finder. The Chebyshev polynomial concept encourages us to use a linear combination of previously discovered velocities beyond that proposed b...
Journal title:
- Proteins
Volume 67, Issue 4
Pages -
Publication date 2007